Non-vocal State N
Abstract
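This paper describes a method for temporally aligning lyrics with musical audio signals, such as commercial CD recordings, that contain both singing voice and accompaniment. By aligning the audio signal with its corresponding lyrics, the method estimates the start and end time of each phrase of the lyrics. It can be applied to automatic generation of captions for music videos and to lyric-based cueing of playback positions.
Related prior work includes LyricAlly [1], developed by Wang et al. They estimated the temporal correspondence using only the duration of each phoneme in the lyrics, without considering the phonological features of the singing voice. However, because phoneme durations vary greatly depending on where they occur in a song, accurate alignment was not possible.
In this study, we estimate the temporal correspondence from the phonological features of the singing voice, based on forced alignment as used in speech recognition. The alignment methods used in current speech recognition, however, target only clean speech without background sounds, so the challenge is to achieve accurate alignment when accompaniment is played together with the singing voice and when there are interlude sections with no singing. To solve this problem, we first apply the accompaniment sound reduction method that we developed previously [2]. This method extracts and resynthesizes the harmonic structure of the melody, thereby separating out only the melody, which includes the singing voice. Next, we detect the sections in which singing voice is actually present, using vocal activity detection based on a hidden Markov model (HMM) that moves back and forth between a vocal state and a non-vocal state, as shown in Figure 1. Finally, we use forced alignment to align the separated singing voice with the lyrics. At this stage, the acoustic model can also be adapted to the separated singing voice of a specific singer.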
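The vocal activity detection step in the abstract can be illustrated with a minimal two-state HMM decoded by the Viterbi algorithm. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not specify the acoustic features, the state output distributions, or the transition probabilities, so the function names (`viterbi_vocal_nonvocal`, `vocal_segments`), the per-frame likelihoods, the 0.9/0.1 transition values, and the 0.5-second frame length below are all hypothetical.

```python
import math


def viterbi_vocal_nonvocal(log_lik, log_trans, log_init):
    """Return the most likely state path (0 = non-vocal, 1 = vocal).

    log_lik[t][s]   : log-likelihood of frame t under state s (assumed given)
    log_trans[i][j] : log-probability of moving from state i to state j
    log_init[s]     : log-probability of starting in state s
    """
    n_states = len(log_init)
    # score[s] = best log-probability of any path ending in state s so far
    score = [log_init[s] + log_lik[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(log_lik)):
        new_score, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log_trans[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j] + log_lik[t][j])
        score = new_score
        backptr.append(ptr)
    # Trace the best path backwards from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def vocal_segments(path, frame_sec):
    """Convert a frame-level state path into (start, end) times of vocal runs."""
    segments, start = [], None
    for t, s in enumerate(path):
        if s == 1 and start is None:
            start = t
        elif s != 1 and start is not None:
            segments.append((start * frame_sec, t * frame_sec))
            start = None
    if start is not None:
        segments.append((start * frame_sec, len(path) * frame_sec))
    return segments


# Toy example: hypothetical per-frame likelihoods as (non-vocal, vocal) pairs.
# Frame 3 slightly favours non-vocal, but sits inside a vocal run.
lik = [(0.9, 0.1), (0.9, 0.1), (0.1, 0.9), (0.6, 0.4),
       (0.1, 0.9), (0.1, 0.9), (0.9, 0.1), (0.9, 0.1)]
log_lik = [[math.log(a), math.log(b)] for a, b in lik]

# Assumed transition model: states tend to persist from frame to frame.
stay, switch = math.log(0.9), math.log(0.1)
log_trans = [[stay, switch], [switch, stay]]
log_init = [math.log(0.5), math.log(0.5)]

path = viterbi_vocal_nonvocal(log_lik, log_trans, log_init)
segments = vocal_segments(path, frame_sec=0.5)
```

In this toy setting the self-transition preference smooths over the single ambiguous frame, so the decoded vocal region is one continuous segment. Only the detected vocal segments would then be passed on to forced alignment against the lyrics.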
Similar resources
Our Experience with Kashimas Procedure for Bilateral Abductor Vocal Cord Palsy
Introduction: The Kashima operation, also known as endoscopic laser cordotomy, is used for the treatment of bilateral abductor vocal cord palsy. A glottic chink is made posteriorly, wide enough for the patient to breathe comfortably without any stridor. Materials and Methods: This clinical trial was performed on 12 patients[1] with bilateral abductor vocal cord paralysis. All patients...
The Effects of Size and Type of Vocal Fold Polyp on Some Acoustic Voice Parameters
Background: Vocal abuse and misuse can result in vocal fold polyps. Certain features determine the extent to which a vocal fold polyp affects acoustic voice parameters. The present study aimed to define the effects of polyp size on acoustic voice parameters, and to compare these parameters in hemorrhagic and non-hemorrhagic polyps. Methods: In the present retrospective study, 28 individuals with hemorrha...
Vocal cord paralysis: What matters between idiopathic and non-idiopathic cases?
OBJECTIVES: This study aims to evaluate the demographic and clinical characteristics of patients with idiopathic and non-idiopathic vocal cord paralysis (VCP). PATIENTS AND METHODS: This retrospective cohort was performed on data extracted from medical files of 92 consecutive patients (43 males, 49 females; median age 52.1±23.1 years; min. 1 - max. 87) with VCP diagnosed in the otorhinolaryngol...
Effects of Rheumatoid Arthritis on the Larynx
Introduction: The aim of the present study was to compare the videolaryngostroboscopic findings between patients with rheumatoid arthritis and vocally healthy controls. Materials and Methods: This case-control descriptive study was performed on 113 people, including 50 patients with rheumatoid arthritis and 63 controls. The participants were subjected to videolaryngost...
Heterospecific alarm call recognition in a non-vocal reptile.
The ability to recognize and respond to the alarm calls of heterospecifics has previously been described only in species with vocal communication. Here we provide evidence that a non-vocal reptile, the Galápagos marine iguana (Amblyrhynchus cristatus), can eavesdrop on the alarm call of the Galápagos mockingbird (Nesomimus parvulus) and respond with anti-predator behaviour. Eavesdropping on com...
Visual sensitivity to a conspicuous male cue varies by reproductive state in Physalaemus pustulosus females.
The vocal sac is a visually conspicuous attribute of most male frogs, but its role in visual communication has only been demonstrated recently in diurnally displaying frogs. Here we characterized the spectral properties of the inflated vocal sac of male túngara frogs (Physalaemus pustulosus), a nocturnal species, and túngara visual sensitivity to this cue across reproductive state and sex. We m...